AITopics | orthogonal convolution

2 Background

Neural Information Processing Systems, Feb-11-2026, 15:17:40 GMT

In principle, one can design Lipschitz constrained architectures using the composition property of Lipschitz functions, but Anil et al. [2] recently identified a key obstacle to this approach: gradient norm attenuation.
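The composition property mentioned in this snippet says that the Lipschitz constant of a composition is at most the product of the per-layer constants. A minimal NumPy sketch for purely linear layers follows; the layer shapes, the random weights, and the helper names `spectral_norm` and `lipschitz_upper_bound` are assumptions made for illustration and do not come from the paper. It illustrates only the bound, not the gradient norm attenuation analysis of Anil et al.

```python
import numpy as np


def spectral_norm(w):
    """Largest singular value of w: the L2 Lipschitz constant of x -> w @ x."""
    return float(np.linalg.norm(w, ord=2))


def lipschitz_upper_bound(weights):
    """Product of per-layer spectral norms (the composition bound)."""
    return float(np.prod([spectral_norm(w) for w in weights]))


rng = np.random.default_rng(0)
# Hypothetical stack of three linear layers with random weights.
weights = [rng.standard_normal((8, 8)) for _ in range(3)]

# Rescale every layer to spectral norm 1, so each layer is 1-Lipschitz.
normalized = [w / spectral_norm(w) for w in weights]

composed = normalized[2] @ normalized[1] @ normalized[0]
bound = lipschitz_upper_bound(normalized)  # equals 1 up to rounding
exact = spectral_norm(composed)            # never exceeds the bound

print(f"composition bound: {bound:.6f}, exact constant: {exact:.6f}")
```

The same bound still holds if a 1-Lipschitz activation such as ReLU is placed between the layers, since each activation contributes a factor of at most 1.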

artificial intelligence, arxiv preprint arxiv, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Canary Islands (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)


5d1a0188e18c1d74a0f8d6eb5ecede4f-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 07:06:48 GMT

approximation, convolution, matrix, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


5d1a0188e18c1d74a0f8d6eb5ecede4f-Paper-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 07:06:44 GMT

arxiv preprint arxiv, matrix, neural network, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks

Neural Information Processing Systems, Oct-2-2025, 07:22:28 GMT

We extend their approach to train scalable, expressive, provably Lipschitz convolutional networks.

artificial intelligence, convolution, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.46)
North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


5d1a0188e18c1d74a0f8d6eb5ecede4f-Supplemental-Conference.pdf

Neural Information Processing Systems, Aug-15-2025, 03:45:32 GMT

approximation, convolution, matrix, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


5d1a0188e18c1d74a0f8d6eb5ecede4f-Paper-Conference.pdf

Neural Information Processing Systems, Aug-15-2025, 03:45:28 GMT

arxiv preprint arxiv, matrix, neural network, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)


Robustness as Architecture: Designing IQA Models to Withstand Adversarial Perturbations

Meleshin, Igor, Chistyakova, Anna, Antsiferova, Anastasia, Vatolin, Dmitriy

arXiv.org Artificial Intelligence, Jun-6-2025

Image Quality Assessment (IQA) models are increasingly relied upon to evaluate image quality in real-world systems -- from compression and enhancement to generation and streaming. Yet their adoption brings a fundamental risk: these models are inherently unstable. Adversarial manipulations can easily fool them, inflating scores and undermining trust. Traditionally, such vulnerabilities are addressed through data-driven defenses -- adversarial retraining, regularization, or input purification. But what if this is the wrong lens? What if robustness in perceptual models is not something to learn but something to design? In this work, we propose a provocative idea: robustness as an architectural prior. Rather than training models to resist perturbations, we reshape their internal structure to suppress sensitivity from the ground up. We achieve this by enforcing orthogonal information flow, constraining the network to norm-preserving operations -- and further stabilizing the system through pruning and fine-tuning. The result is a robust IQA architecture that withstands adversarial attacks without requiring adversarial training or significant changes to the original model. This approach suggests a shift in perspective: from optimizing robustness through data to engineering it through design.

artificial intelligence, machine learning, robustness, (19 more...)

arXiv.org Artificial Intelligence

2506.04951

Country: Europe > Russia (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (0.37)
Government > Military (0.37)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures

Boissin, Thibaut, Mamalet, Franck, Fel, Thomas, Picard, Agustin Martin, Massena, Thomas, Serrurier, Mathieu

arXiv.org Artificial Intelligence, Jan-14-2025

Orthogonal convolutional layers are the workhorse of multiple areas in machine learning, such as adversarial robustness, normalizing flows, GANs, and Lipschitz-constrained models. Their ability to preserve norms and ensure stable gradient propagation makes them valuable for a large range of problems. Despite their promise, the deployment of orthogonal convolution in large-scale applications is a significant challenge due to computational overhead and limited support for modern features like strides, dilations, group convolutions, and transposed convolutions. In this paper, we introduce AOC (Adaptative Orthogonal Convolution), a scalable method for constructing orthogonal convolutions, effectively overcoming these limitations. This advancement unlocks the construction of architectures that were previously considered impractical. We demonstrate through our experiments that our method produces expressive models that become increasingly efficient as they scale. To foster further advancement, we provide an open-source library implementing this method, available at https://github.com/thib-s/orthogonium.

artificial intelligence, convolution, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2501.0793

Country: Europe (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


projUNN: efficient method for training deep networks with unitary matrices

Kiani, Bobak, Balestriero, Randall, LeCun, Yann, Lloyd, Seth

arXiv.org Artificial Intelligence, Oct-13-2022

In learning with recurrent or very deep feed-forward networks, employing unitary matrices in each layer can be very effective at maintaining long-range stability. However, restricting network parameters to be unitary typically comes at the cost of expensive parameterizations or increased training runtime. We propose instead an efficient method based on rank-$k$ updates -- or their rank-$k$ approximation -- that maintains performance at a nearly optimal training runtime. We introduce two variants of this method, named Direct (projUNN-D) and Tangent (projUNN-T) projected Unitary Neural Networks, that can parameterize full $N$-dimensional unitary or orthogonal matrices with a training runtime scaling as $O(kN^2)$. Our method either projects low-rank gradients onto the closest unitary matrix (projUNN-T) or transports unitary matrices in the direction of the low-rank gradient (projUNN-D). Even in the fastest setting ($k=1$), projUNN is able to train a model's unitary parameters to reach comparable performances against baseline implementations. In recurrent neural network settings, projUNN closely matches or exceeds benchmarked results from prior unitary neural networks. Finally, we preliminarily explore projUNN in training orthogonal convolutional neural networks, which are currently unable to outperform state of the art models but can potentially enhance stability and robustness at large depth.
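The idea of projecting an updated weight onto the closest unitary matrix can be illustrated with a dense SVD: for a real square matrix with SVD $U \Sigma V^\top$, the closest orthogonal matrix in Frobenius norm is $U V^\top$. The sketch below is not the projUNN algorithm. It uses a full $O(N^3)$ decomposition rather than the rank-$k$ updates described in the abstract, and the step size, the matrix size, the random stand-in gradient, and the helper name `closest_orthogonal` are all assumptions for illustration.

```python
import numpy as np


def closest_orthogonal(m):
    """Closest orthogonal matrix to m in Frobenius norm (polar factor via SVD)."""
    u, _, vt = np.linalg.svd(m)
    return u @ vt


rng = np.random.default_rng(0)
n = 6

# Start from an orthogonal weight matrix.
w = closest_orthogonal(rng.standard_normal((n, n)))

# Stand-in for a gradient; a real training loop would supply this.
grad = rng.standard_normal((n, n))

# Plain gradient step leaves the orthogonal group; project back onto it.
w_next = closest_orthogonal(w - 0.1 * grad)

print(np.allclose(w_next.T @ w_next, np.eye(n)))
```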

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Artificial Intelligence

2203.05483

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Orthogonalizing Convolutional Layers with the Cayley Transform

Trockman, Asher, Kolter, J. Zico

arXiv.org Machine Learning, Apr-14-2021

Recent work has highlighted several advantages of enforcing orthogonality in the weight layers of deep networks, such as maintaining the stability of activations, preserving gradient norms, and enhancing adversarial robustness by enforcing low Lipschitz constants. Although numerous methods exist for enforcing the orthogonality of fully-connected layers, those for convolutional layers are more heuristic in nature, often focusing on penalty methods or limited classes of convolutions. In this work, we propose and evaluate an alternative approach to directly parameterize convolutional layers that are constrained to be orthogonal. Specifically, we propose to apply the Cayley transform to a skew-symmetric convolution in the Fourier domain, so that the inverse convolution needed by the Cayley transform can be computed efficiently. We compare our method to previous Lipschitz-constrained and orthogonal convolutional layers and show that it indeed preserves orthogonality to a high degree even for large convolutions. Applied to the problem of certified adversarial robustness, we show that networks incorporating the layer outperform existing deterministic methods for certified defense against $\ell_2$-norm-bounded adversaries, while scaling to larger architectures than previously investigated. Code is available at https://github.com/locuslab/orthogonal-convolutions.
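The Cayley transform named in this abstract maps a skew-symmetric matrix $A$ to an orthogonal matrix $Q = (I - A)(I + A)^{-1}$. The sketch below shows only the dense-matrix case in NumPy. It does not implement the Fourier-domain convolutional layer from the paper, and the helper name `cayley`, the matrix size, and the random inputs are assumptions for illustration.

```python
import numpy as np


def cayley(b):
    """Map an arbitrary square matrix b to an orthogonal matrix.

    a = b - b.T is skew-symmetric, so I + a is always invertible.
    (I + a)^{-1} and (I - a) commute, so solving (I + a) q = (I - a)
    gives q = (I - a)(I + a)^{-1}.
    """
    a = b - b.T
    eye = np.eye(a.shape[0])
    return np.linalg.solve(eye + a, eye - a)


rng = np.random.default_rng(0)
q = cayley(rng.standard_normal((5, 5)))  # unconstrained parameters in, orthogonal out
x = rng.standard_normal(5)

print(np.allclose(q.T @ q, np.eye(5)))                       # orthogonality
print(np.isclose(np.linalg.norm(q @ x), np.linalg.norm(x)))  # norm preservation
```

Because the input `b` is unconstrained, a parameterization of this kind can be optimized with ordinary gradient methods while the resulting `q` stays orthogonal by construction.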

cayley transform, convolution, matrix, (13 more...)

arXiv.org Machine Learning

2104.07167

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
